Integrate Syntactic Information into Chinese Word Sense Disambiguation Technology

Authors

  • Chun-Xiang Zhang
  • Bo Luan
Abstract

Word sense disambiguation (WSD) is important to many applications in natural language processing, including information retrieval, machine translation, text classification, and automatic summarization. Its task is to automatically choose the intended sense of an ambiguous word in a given context. This paper proposes a new supervised WSD method that introduces syntactic information. A parse tree is built for the context containing the ambiguous word. The parse tree is then traversed, and disambiguation features are extracted from it, including parsing information, part-of-speech information, and word information. A Bayesian model is used to build a WSD classifier for each ambiguous Chinese word. Experimental results show that disambiguation accuracy reaches 60% when the classifier is applied to the test data.
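The pipeline described in the abstract (traverse a parse tree, collect parsing, part-of-speech, and word features, then classify with a Bayesian model) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature templates, the nested-tuple tree format, the add-one smoothing, the sense labels "play" and "call", and the names `extract_features` and `NaiveBayesWSD` are all assumptions made for illustration, and the toy parse trees are written by hand rather than produced by a parser.

```python
import math
from collections import Counter, defaultdict

# Assumed tree format: (label, child, ...); a leaf is (pos_tag, word).

def extract_features(tree, target):
    """Traverse the tree and collect features for the ambiguous word `target`."""
    features = []

    def walk(node, ancestors):
        label, *children = node
        if len(children) == 1 and isinstance(children[0], str):
            word = children[0]
            if word == target:
                # POS of the target plus the phrase labels above it
                features.append("pos=" + label)
                features.extend("anc=" + a for a in ancestors)
            else:
                # context word and its POS tag
                features.append("word=" + word)
                features.append("ctxpos=" + label)
            return
        for child in children:
            walk(child, ancestors + [label])

    walk(tree, [])
    return features


class NaiveBayesWSD:
    """Naive Bayes over bag-of-features, with add-one smoothing (an assumption)."""

    def __init__(self):
        self.sense_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()

    def train(self, examples):
        for features, sense in examples:
            self.sense_counts[sense] += 1
            for f in features:
                self.feature_counts[sense][f] += 1
                self.vocab.add(f)

    def predict(self, features):
        total = sum(self.sense_counts.values())
        best, best_score = None, -math.inf
        for sense, count in self.sense_counts.items():
            score = math.log(count / total)
            denom = sum(self.feature_counts[sense].values()) + len(self.vocab)
            for f in features:
                score += math.log((self.feature_counts[sense][f] + 1) / denom)
            if score > best_score:
                best, best_score = sense, score
        return best


# Hand-written toy trees for the ambiguous word "打" (hypothetical data).
train_data = [
    (("IP", ("NP", ("PN", "他")),
            ("VP", ("VV", "打"), ("NP", ("NN", "篮球")))), "play"),
    (("IP", ("NP", ("PN", "她")),
            ("VP", ("VV", "打"), ("NP", ("NN", "电话")))), "call"),
]
test_tree = ("IP", ("NP", ("PN", "我")),
                   ("VP", ("VV", "打"), ("NP", ("NN", "篮球"))))

clf = NaiveBayesWSD()
clf.train([(extract_features(t, "打"), s) for t, s in train_data])
predicted = clf.predict(extract_features(test_tree, "打"))
print(predicted)
```

One classifier instance is trained per ambiguous word, which mirrors the abstract's statement that a classifier is built for every ambiguous Chinese word. A real system would obtain the trees from a Chinese parser and train on a sense-annotated corpus.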


Similar resources

SRCB-WSD: Supervised Chinese Word Sense Disambiguation with Key Features

This article describes the implementation of a Word Sense Disambiguation system that participated in the SemEval-2007 multilingual Chinese-English lexical sample task. We adopted a supervised learning approach with a Maximum Entropy classifier. The features used were neighboring words and their part-of-speech, as well as single words in the context, and other syntactic features based on shallow par...


Using Verb Subcategorization for Word Sense Disambiguation

We develop a model for predicting verb sense from subcategorization information and integrate it into SSI-Dijkstra, a wide-coverage knowledge-based WSD algorithm. Adding syntactic knowledge in this way should correct the current poor performance of WSD systems on verbs. This paper also presents, for the first time, an evaluation of SSI-Dijkstra on a standard data set which enables a comparison ...


Probabilistic Coordination Disambiguation in a Fully-Lexicalized Japanese Parser

This paper describes a probabilistic model for coordination disambiguation integrated into syntactic and case structure analysis. Our model probabilistically assesses the parallelism of a candidate coordinate structure using syntactic/semantic similarities and co-occurrence statistics. We integrate these probabilities into the framework of fully-lexicalized parsing based on large-scale case frame...


Enriching EWN with Syntagmatic Information by Means of WSD

Word Sense Disambiguation faces a lack of syntagmatic information associated with word senses. In the present work we propose a method for the enrichment of EuroWordNet with syntagmatic information, by means of the WSD process itself. We consider that an ambiguous occurrence drastically reduces its ambiguity when considered together with the words with which it establishes syntactic relations in ...


Combining a Chinese Thesaurus with a Chinese Dictionary

In this paper, we study the problem of combining a Chinese thesaurus with a Chinese dictionary by linking the word entries in the thesaurus with the word senses in the dictionary, and propose a similar word strategy to solve the problem. The method is based on the definitions given in the dictionary, but without any syntactic parsing or sense disambiguation on them at all. As a resu...




Publication date: 2013